Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Authors

  • Zhongqiang Huang
  • Jacob Devlin
  • Rabih Zbib
Abstract

This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. Traditional approaches introduce syntactic constraints directly into translation rules by explicitly decorating them with syntactic annotations, which often exacerbates the data sparsity problem and causes other problems. In contrast, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags, which is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.
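The two models described in the abstract can be illustrated with a toy sketch. The abstract does not give the paper's actual feature definitions, so everything below is an assumption made for illustration: the function names, the tag set, the additive smoothing, the negative log-probability cost, and the relative-frequency reordering estimate are hypothetical and are not the authors' formulation.

```python
import math
from collections import Counter


def tag_distribution(observed_tags, tagset=("NP", "VP", "PP", "X"), smoothing=0.1):
    """Smoothed distribution over syntactic tags for one rule nonterminal.

    observed_tags: tags of the source spans the nonterminal covered in
    training data (hypothetical input, assumed for this sketch).
    """
    counts = Counter(observed_tags)
    total = sum(counts[t] for t in tagset) + smoothing * len(tagset)
    return {t: (counts[t] + smoothing) / total for t in tagset}


def mismatch_cost(distribution, span_tag, floor=1e-6):
    """Soft mismatch feature: negative log-probability of the span's tag.

    A low cost means the rule is syntactically compatible with the span.
    Unknown tags fall back to a small floor probability (an assumption).
    """
    return -math.log(distribution.get(span_tag, floor))


def reorder_probability(examples, parent, left, right, prior=0.5, strength=1.0):
    """Relative-frequency estimate that two sibling constituents swap order.

    examples: (parent_label, left_label, right_label, was_swapped) tuples,
    a hypothetical training format. The estimate is pulled toward `prior`.
    """
    key = (parent, left, right)
    matching = [swapped for p, l, r, swapped in examples if (p, l, r) == key]
    return (sum(matching) + prior * strength) / (len(matching) + strength)


# A nonterminal that mostly covered noun phrases in training.
np_like = tag_distribution(["NP", "NP", "NP", "PP"])
cost_np = mismatch_cost(np_like, "NP")  # compatible span: low cost
cost_vp = mismatch_cost(np_like, "VP")  # incompatible span: high cost

# Sibling pairs observed under a parent node, with swap decisions.
pairs = [
    ("VP", "PP", "VP", True),
    ("VP", "PP", "VP", True),
    ("VP", "PP", "VP", False),
    ("NP", "DT", "NN", False),
]
p_swap = reorder_probability(pairs, "VP", "PP", "VP")
p_unseen = reorder_probability(pairs, "IP", "NP", "VP")
```

In a decoder, values such as `cost_np` and `p_swap` would enter the log-linear model as weighted features rather than as hard filters, which is what makes the constraints soft: a syntactically mismatched derivation is penalized but not excluded.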


Similar resources

Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation

In this work, we investigate the effectiveness of two techniques for a feature-based integration of syntactic information into GHKM string-to-tree statistical machine translation (Galley et al., 2004): (1.) Preference grammars on the target language side promote syntactic well-formedness during decoding while also allowing for derivations that are not linguistically motivated (as in hierarchical ...

Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

In this paper, we present a novel approach to enhance hierarchical phrase-based machine translation systems with linguistically motivated syntactic features. Rather than directly using treebank categories as in previous studies, we learn a set of linguistically-guided latent syntactic categories automatically from a source-side parsed, word-aligned parallel corpus, based on the hierarchical str...

A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation

This paper explores a simple and effective unified framework for incorporating soft linguistic reordering constraints into a hierarchical phrase-based translation system: 1) a syntactic reordering model that explores reorderings for context-free grammar rules; and 2) a semantic reordering model that focuses on the reordering of predicate-argument structures. We develop novel features based on b...

Learning Hierarchical Translation Spans

We propose a simple and effective approach to learn translation spans for the hierarchical phrase-based translation model. Our model evaluates if a source span should be covered by translation rules during decoding, which is integrated into the translation system as soft constraints. Compared to syntactic constraints, our model is directly acquired from an aligned parallel corpus and does not r...

Improving statistical machine translation with linguistic information

Statistical machine translation (SMT) should benefit from linguistic information to improve performance but current state-of-the-art models rely purely on data-driven models. There are several reasons why prior efforts to build linguistically annotated models have failed or not even been attempted. Firstly, the practical implementation often requires too much work to be cost effective. Where ad...

Publication date: 2013